SUMMARY

A nucleotide sequence of the yeast Saccharomyces cerevisiae omnipotent suppressor SUP2 (SUP35) gene is
presented. The sequence contains a single open reading frame (ORF) of 2055 bp, which may encode a 76.5-kDa
protein. A single transcript of 2.3 kb corresponding to a complete ORF is found. Analysis of codon bias suggests
that the SUP2 gene is not highly expressed. The C-terminal part of the deduced amino acid sequence shows
a high homology to yeast elongation factor EF-1α, whereas the N-terminal part is unique for the SUP2 protein.
The N terminus contains a number of short repeating elements and possesses an unusual amino acid
composition.

Analysis of the nucleotide and deduced amino acid sequences indicates that three additional proteins could
possibly be expressed, two of which might be initiated on internal ATG codons and a third might be formed
by alternative splicing. One of these proteins is supposed to be imported into mitochondria. Possible functions
of the SUP2 gene product(s), especially its putative activity as a soluble factor controlling the fidelity of
translation, are discussed.



INTRODUCTION 

Studies of informational suppression have proved to be useful in elucidating the mechanisms of
control of translational fidelity. In all cases studied so far informational suppression results
from mutational alterations in the components of the protein synthesis apparatus, usually either
tRNAs or ribosomal constituents (Ozeki et al., 1980; Sherman, 1982; Dequard-Chablat et al., 1986;
Steege and Söll, 1979). Recently, nonsense-suppressor mutations in tuf genes coding for EF-Tu have
been described in Escherichia coli (Vijgenboom et al., 1985). However, similar mutations in
eukaryotes have not yet been reported.

Abbreviations: EF, elongation factor; kb, kilobases or 1000 bp; MBN, mung-bean nuclease; MBN buffer,
see MATERIALS AND METHODS, section b; nt, nucleotide(s); ORF, open reading frame; Pipes,
piperazine-N,N'-bis[2-ethanesulfonic acid]; PolIk, Klenow (large) fragment of E. coli DNA
polymerase I; S1 buffer, see MATERIALS AND METHODS, section b; tRNA, transfer RNA; u, unit(s).

For the past several years we have been studying recessive omnipotent suppressors in yeast. It was
shown that mutations in the genes named SUP1 (SUP45) and SUP2 (SUP35) give rise to a variety of
pleiotropic effects, including temperature sensitivity, drug sensitivity and respiratory deficiency.
From these observations we concluded that the suppressor genes are essential for viability.
Biochemical analyses indicate that suppressor mutations decrease the accuracy of translation and
affect protein synthesis both in the cytoplasm and in mitochondria (Surguchov et al., 1984).

Recently, both the SUP1 and SUP2 genes were cloned (Breining et al., 1984; Telckov et al., 1986)
and the nucleotide sequence of the SUP1 gene was determined (Breining and Piepersberg, 1986). In
this paper we report the nucleotide sequence of the SUP2 gene. Part of the sequence shows
significant homology to yeast EF-1α, suggesting that the SUP2 gene product is not a canonical
ribosomal protein, but rather a soluble translation factor. This essential protein appears to be
present in minor quantities and probably has not been detected by biochemical methods. Further
characterization of its role may reveal new essential features of the eukaryotic translation
machinery.



MATERIALS AND METHODS

(a) Subcloning and sequencing

A shuttle plasmid pSTR4 containing the SUP2 gene (Telckov et al., 1986) was used for the sequence
determination. A set of subclones of the SUP2 gene sufficient for sequencing was obtained in two
steps: (i) restriction fragments of SUP2 were cloned into M13mp phages (M13mp10, 11, 18, 19), and
(ii) in some cases subclones were further deleted using DNase I, as described (Lin et al., 1984).
DNA restriction, ligation and other enzymatic treatments were carried out according to the
suppliers' specifications (Pharmacia P-L Biochemicals). Transformation of E. coli (strain JM101) by
M13 phage and purification of recombinant phage were done according to Messing (1983). The
nucleotide sequence was determined using the dideoxy method of Sanger et al. (1977).

(b) Yeast RNA analysis 

Preparation of total yeast RNA was performed as described by Cottrelle et al. (1985). For the
Northern analysis, 20 µg of total RNA were glyoxylated, electrophoresed on an agarose gel and
transferred to nitrocellulose (Maniatis et al., 1982). RNA blots were hybridized with
strand-specific M13 probes, which were prepared according to Messing (1983).

To obtain a single-stranded 32P-labelled probe for 5'-end mapping, an M13 clone containing fragment
KpnI-BcnI (bp 164 to -205, see Fig. 2) with the KpnI site proximal to a sequencing primer site was
used. The probe was synthesized by primer extension with PolIk, cleaved at the 3' end with BcnI and
separated from the template using a 5% polyacrylamide gel (Leer et al., 1984). Hybridization was
performed for 6 h at 46°C in 80% formamide, 0.4 M NaCl, 0.4 M Pipes (pH 6.5) and 1 mM EDTA. Total
yeast RNA (50 µg) and 100000 cpm of probe were used for each experiment. Then hybridization
mixtures were diluted ten-fold with S1 or MBN buffer (30 mM Na·acetate, pH 4.6, 1 mM ZnSO4,
250 mM (50 mM for MBN buffer) NaCl, 20 µg/ml of sonicated and denatured calf thymus DNA),
supplemented with 1000-4000 u/ml of the appropriate nuclease and digested for 30 min at 37°C.
After chloroform extraction the protected DNA fragments were precipitated with isopropanol in the
presence of carrier tRNA and analyzed on a 5% polyacrylamide, 7 M urea sequencing gel.



RESULTS AND DISCUSSION 

(a) Nucleotide sequence 

Fig. 1. Restriction map and sequencing strategy for the SUP2 gene. Position and orientation of the
ORF is represented by the arrowed open bar. Small arrows indicate the direction and extent of
sequence determination on individual clones.

For sequence analysis, a shuttle plasmid pSTR4 (Telckov et al., 1986) carrying a minimal fragment
of the cloned yeast genomic DNA complementing the temperature-sensitive sup2 mutation was used. The
restriction map for this fragment and the sequencing strategy are shown in Fig. 1. A contiguous
sequence of 3320 nt containing a single long ORF of 2055 nt capable of coding for a protein of
76545 Da was determined (Fig. 2).
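
The ORF length and predicted mass quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the 2055-nt figure excludes the stop codon (which agrees with the 685-residue total in Table II), and the function name `orf_summary` is ours.

```python
ORF_LENGTH_NT = 2055      # length of the long ORF, from the text
PROTEIN_MASS_DA = 76545   # predicted mass of the encoded protein, from the text


def orf_summary(orf_length_nt, protein_mass_da):
    """Return (number of codons, mean mass per residue in Da)."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_length_nt // 3
    return codons, protein_mass_da / codons


codons, mean_residue_mass = orf_summary(ORF_LENGTH_NT, PROTEIN_MASS_DA)
print(f"{codons} codons, {mean_residue_mass:.1f} Da per residue on average")
```

The mean residue mass obtained this way is a convenient yardstick for the sizes of the shorter putative products discussed later.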

(b) mRNA analysis 

Hybridization of total yeast RNA with a single-stranded M13 probe containing a PstI-XbaI fragment
revealed a single band of 2.3 kb. The opposite strand of the same fragment did not hybridize with
RNA. By S1 and MBN mapping two major transcription start points were found at nt positions -15
and -37, as well as two minor sites at nt positions -57 and -43 (Fig. 3), before the first ATG
codon in the ORF. Taking into consideration that the size of the transcript is 2.3 kb, we conclude
that the transcript contains the full length of the ORF. The first
ATG in the ORF is present on the transcript and
therefore il is the most probable initiator of trans- 



TABLE I

Codon usage in the SUP2 gene

aa    Codon*        aa    Codon         aa    Codon         aa    Codon

Phe   TTT    9      Ser   TCT   12      Tyr   TAT   10      Cys   TGT    4
Phe   TTC    7      Ser   TCC    7      Tyr   TAC   25      Cys   TGC    1
Leu   TTA    7      Ser   TCA    6      ter** TAA    1      ter   TGA    0
Leu   TTG   15      Ser   TCG    3      ter   TAG    0      Trp   TGG    4
Leu   CTT    3      Pro   CCT   10      His   CAT    7      Arg   CGT    6
Leu   CTC    0      Pro   CCC    2      His   CAC    6      Arg   CGC    0
Leu   CTA    7      Pro   CCA   18      Gln   CAA   40      Arg   CGA    0
Leu   CTG    3      Pro   CCG    0      Gln   CAG   13      Arg   CGG    0
Ile   ATT   17      Thr   ACT   14      Asn   AAT   24      Ser   AGT    5
Ile   ATC   12      Thr   ACC   16      Asn   AAC   21      Ser   AGC    2
Ile   ATA    3      Thr   ACA    8      Lys   AAA   28      Arg   AGA   11
Met   ATG   19      Thr   ACG    1      Lys   AAG   38      Arg   AGG    1
Val   GTT   26      Ala   GCT   20      Asp   GAT   21      Gly   GGT   45
Val   GTC   10      Ala   GCC   16      Asp   GAC    9      Gly   GGC   10
Val   GTA    9      Ala   GCA    7      Glu   GAA   44      Gly   GGA    3
Val   GTG    5      Ala   GCG    0      Glu   GAG   13      Gly   GGG    2

* Codons preferred in highly expressed yeast genes are underlined.
** Symbol ter represents translational stop codons.
agaaattaaagctactta

-720 caacaacggtctactacaaa;itaaggtgcct.\aaattgtcaatoacactcaaaagc[:gaagccaa,\aaagaggatcgccattgaccaaatacccgacc^^ 

-600 TCCCAACCCTACGGTAGAAAATTGA.;TATCGTATCTCmATACACACATACATACAnTATATnATAATAAGCGTTAAAATrrCCGCAGAATATCTGTCAACCACACAA^ 
-480 AACGAATGGTATATGCTTCATTTCTTTGTTTCGCATTACCTGCGCTATTTGACTCAAATTATTATTTTTTACTAAGACGAC^^^ 

-360 TCTCTTAAACCACTTCATAAAGTTGTCAAGTTCATAGCAAAATTCTTCCCCAAAAAGATGAATCTTAGTTCTCAGCCCACCAAAAGAGGTACATGCTAAG 

Bcn I

-240 AGTTCTTACCTTGCTCTTAAArr.TACATTACAACCOGGTATTATATCTTACATCATCGTATAATATGATCTTTCTTTATGGAGAAAATTTTTTmCACTC^ 

-120 TCTOAAGAGTGTAGTGTAlWn;GGTACATCTTCTCTTGAAAGACTCCATTGTACTCTAACAAAA^ 

1 ATG TCG GAT TCA AAC CAA GCC AAC AAT CAG CAA AAC TAC CAG CAA TAC AGC CAG AAC GGT AAC CAA CAA CAA GGT AAC AAC AGA TAC CAA
1 Met Ser Asp Ser Asn Gln Gly Asn Asn Gln Gln Asn Tyr Gln Gln Tyr Ser Gln Asn Gly Asn Gln Gln Gln Gly Asn Asn Arg Tyr Gln
Hind III Pst I Kpn I

91 GGT TAT CAA GCT TAC AAT GCT CAA GCC CAA CCT GCA GGT GGG TAC TAC CAA AAT TAC CAA GGT TAT TCT GGG TAC CAA CAA GGT GGC TAT
31 Gly Tyr Gln Ala Tyr Asn Ala Gln Ala Gln Pro Ala Gly Gly Tyr Tyr Gln Asn Tyr Gln Gly Tyr Ser Gly Tyr Gln Gln Gly Gly Tyr

181 CAA CAG TAC AAT CCC GAC GCC GGT TAC CAG CAA CAG TAT AAT CCT CAA GGA GGC TAT CAA CAG TAC AAT CCT CAA GGC GGT TAT CAG CAG
61 Gln Gln Tyr Asn Pro Asp Ala Gly Tyr Gln Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln Tyr Asn Pro Gln Gly Gly Tyr Gln Gln

271 CAA TTC AAT CCA CAA GCT GGC CGT GGA AAT TAC AAA AAC TTC AAC TAC AAT AAC AAT TTG CAA GGA TAT CAA GCT GGT TTC CAA CCA CAG
91 Gln Phe Asn Pro Gln Gly Gly Arg Gly Asn Tyr Lys Asn Phe Asn Tyr Asn Asn Asn Leu Gln Gly Tyr Gln Ala Gly Phe Gln Pro Gln

361 TCT CAA GGT ATG TCT TTG AAC GAC TTT CAA AAG CAA CAA AAG CAG CCC GCT CCC AAA CCA AAG AAG ACT TTG AAG CTT GTC TCN AGT TCC
121 Ser Gln Gly Met Ser Leu Asn Asp Phe Gln Lys Gln Gln Lys Gln Ala Ala Pro Lys Pro Lys Lys Thr Leu Lys Leu Val Ser Ser Ser

451 GGT ATC AAC TTG GCC AAT GCT ACC AAG AAG GTT GGC ACA AAA CCT GCC GAA TCT GAT AAG AAA GAG CAA GAG AAG TCT GCT GAA ACC AAA
151 Gly Ile Lys Leu Ala Asn Ala Thr Lys Lys Val Gly Thr Lys Pro Ala Glu Ser Asp Lys Lys Glu Glu Glu Lys Ser Ala Glu Thr Lys

541 GAA CCA ACT AAA GAG CCA ACA AAG GTC GAA GAA CCA GTT AAA AAG GAG GAG AAA CCA GTC CAG ACT GAA GAA AAG ACN GAC GAA AAA TCG
181 Glu Pro Thr Lys Glu Pro Thr Lys Val Glu Glu Pro Val Lys Lys Glu Glu Lys Pro Val Gln Thr Glu Glu Lys Thr Glu Glu Lys Ser

631 GAA CTT CCA AAG GTA GAA GAC CTT AAA ATC TCT GAA TCA ACA CAT AAT ACC AAC AAT GCC AAT GTT ACC AGT GCT GAT CCC TTG ATC AAG
211 Glu Leu Pro Lys Val Glu Asp Leu Lys Ile Ser Glu Ser Thr His Asn Thr Asn Asn Ala Asn Val Thr Ser Ala Asp Ala Leu Ile Lys

721 GAA CAG GAA GAA GAA GTG GAT GAC GAA GTT GTT AAC CAT ATG TTT GGT GGT AAA GAT CAC GTT TCT TTA ATT TTC ATC GGT CAT GTT CAT
241 Glu Gln Glu Glu Glu Val Asp Asp Glu Val Val Asn Asp Met Phe Gly Gly Lys Asp His Val Ser Leu Ile Phe Met Gly His Val Asp

811 GCC GGT AAA TCT ACT ATC GGT CCT AAT CTA CTA TAC TTG ACT GGC TCT GTG GAT AAG AGA ACT ATT GAG AAA TAT GAA AGA GAA CCC AAG
271 Ala Gly Lys Ser Thr Met Gly Gly Asn Leu Leu Tyr Leu Thr Gly Ser Val Asp Lys Arg Thr Ile Glu Lys Tyr Glu Arg Glu Ala Lys

901 GAT GCA GGC AGA CAA GGT TGG TAC TTG TCA TGG GTC ATG GAT ACC AAC AAA GAA GAA ACA AAT GAT GGT AAG ACT ATC GAA GTT GCT AAG
301 Asp Ala Gly Arg Gln Gly Trp Tyr Leu Ser Trp Val Met Asp Thr Asn Lys Glu Glu Arg Asn Asp Gly Lys Thr Ile Glu Val Gly Lys



lation. It is interesting to note that sequences 
surrounding the two major transcription start points 
are similar to each other (Fig. 4). 

(c) 5'- and 3'-flanking regions

A promoter element TATATT is located in a position typical for such elements in yeast, i.e.,
between bp -105 and -98 before the first ATG. The sequence AATAAA, which is thought to be a
eukaryotic polyadenylation signal (Fitzgerald and Shenk, 1981), is situated 84-89 bp downstream
from the terminating TAA. The sequence TAG...TAGT...TTT, a potential transcription termination
signal in yeast (Zaret and Sherman, 1982), was found starting 115 bp downstream from the
termination codon TAA. An interesting feature of the 3'-flanking region is the presence of the
repeats (TA)n (95 bp downstream from TAA) and (CAT)n (350 bp downstream from TAA).



(d) Codon usage

The SUP2 gene differs markedly in codon usage from highly expressed yeast genes, particularly
ribosomal protein genes (Table I). The codon bias index according to Bennetzen and Hall (1982) was
determined to be 0.42, whereas the range of values for ribosomal proteins is 0.79-0.94 (Sharp et
al., 1986). Such a difference could mean, according to estimates given by Bennetzen and Hall
(1982), that SUP2 mRNA is at least an order of magnitude less abundant than mRNAs of ribosomal
protein genes.
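
As a rough illustration of how such an index can be obtained from Table I, the sketch below uses the form in which the codon bias index is commonly stated, CBI = (N_pref - N_rand) / (N_tot - N_rand), where N_pref counts preferred codons, N_rand is the count expected if synonymous codons were used equally, and N_tot is the number of codons considered. Several things here are assumptions and not taken from the text: the set of 22 preferred codons (the underlining in Table I is not recoverable), the exclusion of Met, Trp and Asp, and the counts for GTG and AAG, which are inferred from the residue totals in Table II (Val 50, Lys 66).

```python
# Codon counts from Table I, grouped by amino acid. Met, Trp, Asp and the stop
# codons are left out (assumption, see the lead-in). GTG = 5 and AAG = 38 are
# inferred from the Val and Lys totals of Table II.
CODON_COUNTS = {
    "Phe": {"TTT": 9, "TTC": 7},
    "Leu": {"TTA": 7, "TTG": 15, "CTT": 3, "CTC": 0, "CTA": 7, "CTG": 3},
    "Ile": {"ATT": 17, "ATC": 12, "ATA": 3},
    "Val": {"GTT": 26, "GTC": 10, "GTA": 9, "GTG": 5},
    "Ser": {"TCT": 12, "TCC": 7, "TCA": 6, "TCG": 3, "AGT": 5, "AGC": 2},
    "Pro": {"CCT": 10, "CCC": 2, "CCA": 18, "CCG": 0},
    "Thr": {"ACT": 14, "ACC": 16, "ACA": 8, "ACG": 1},
    "Ala": {"GCT": 20, "GCC": 16, "GCA": 7, "GCG": 0},
    "Tyr": {"TAT": 10, "TAC": 25},
    "His": {"CAT": 7, "CAC": 6},
    "Gln": {"CAA": 40, "CAG": 13},
    "Asn": {"AAT": 24, "AAC": 21},
    "Lys": {"AAA": 28, "AAG": 38},
    "Glu": {"GAA": 44, "GAG": 13},
    "Cys": {"TGT": 4, "TGC": 1},
    "Arg": {"CGT": 6, "CGC": 0, "CGA": 0, "CGG": 0, "AGA": 11, "AGG": 1},
    "Gly": {"GGT": 45, "GGC": 10, "GGA": 3, "GGG": 2},
}

# Assumed set of 22 codons preferred in highly expressed yeast genes.
PREFERRED = {
    "GCT", "GCC", "AGA", "AAC", "TGT", "CAA", "GAA", "GGT", "CAC", "ATT", "ATC",
    "TTG", "AAG", "TTC", "CCA", "TCT", "TCC", "ACT", "ACC", "TAC", "GTT", "GTC",
}


def codon_bias_index(counts, preferred):
    """CBI = (N_pref - N_rand) / (N_tot - N_rand)."""
    n_pref = 0.0  # observed occurrences of preferred codons
    n_rand = 0.0  # expected occurrences under equal synonymous usage
    n_tot = 0     # all codons of the amino acids considered
    for codons in counts.values():
        total = sum(codons.values())
        pref = [codon for codon in codons if codon in preferred]
        n_pref += sum(codons[codon] for codon in pref)
        n_rand += total * len(pref) / len(codons)
        n_tot += total
    return (n_pref - n_rand) / (n_tot - n_rand)


cbi = codon_bias_index(CODON_COUNTS, PREFERRED)
print(f"CBI = {cbi:.2f}")
```

Under these assumptions the sketch gives a value close to the 0.42 reported above; a different preferred-codon set would shift the result.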

(e) Deduced amino acid sequence 

A part of the amino acid sequence beginning with Met-254 (the third methionine in the sequence) is
homologous to the full length of yeast EF-1α (Kushnirov et al., 1987). The remaining N-terminal
part can be divided near the second methionine into
991 GCC TAC TTT GAA ACT GAA AAA ACC CGT TAT ACC ATA TTG GAT GCT CCT GGT CAT AAN ATG TAC GTT TCC CAC ATG ATC CGT CGT GCT TCT
331 Ala Tyr Phe Glu Thr Glu Lys Arg Arg Tyr Thr Ile Leu Asp Ala Pro Gly His Lys Met Tyr Val Ser Glu Met Ile Gly Gly Ala Ser

1081 CAA GCT GAT GTT GGT GTT TTG GTC ATT TCC GCC AGA AAG GGT GAG TAC GAA ACC GGT TTT GAG AGA GGT GGT CAA ACT CGT GAA CAC GCC
361 Gln Ala Asp Val Gly Val Leu Val Ile Ser Ala Arg Lys Gly Glu Tyr Glu Thr Gly Phe Glu Arg Gly Gly Gln Thr Arg Glu His Ala

1171 CTA TTG GCC AAG ACC CAA GCT GTT AAT AAG ATG GTT GTC GTC GTA AAT AAG ATG GAT GAC CCA ACC GTT AAC TGG TCT AAG GAA CGT TAC
391 Leu Leu Ala Lys Thr Gln Gly Val Asn Lys Met Val Val Val Val Asn Lys Met Asp Asp Pro Thr Val Asn Trp Ser Lys Glu Arg Tyr

1261 GAC CAA TGT GTG ACT AAT GTC AGC AAT TTC TTG AGA GCA ATT GGT TAC AAC ATT AAG ACA GAC GTT CTA TTT ATC CCA GTA TCC GGC TAC
421 Asp Gln Cys Val Ser Asn Val Ser Asn Phe Leu Arg Ala Ile Gly Tyr Asn Ile Lys Thr Asp Val Val Phe Met Pro Val Ser Gly Tyr

1351 ACT GGT GCA AAT TTG AAA GAT CAC GTA GAT CCA AAA GAA TGC CCA TGC TAC ACC GGC CCA ACT CTG TTA GAA TAT CTG CAT ACA ATG AAC
451 Ser Gly Ala Asn Leu Lys Asp His Val Asp Pro Lys Glu Cys Pro Trp Tyr Thr Gly Pro Thr Leu Leu Glu Tyr Leu Asp Thr Met Asn

Sal I Kpn I
1441 CAC GTC GAC CGT CAC ATC AAT GCT CCA TTC ATG TTG CCT ATT GCC GCT AAG ATG AAG GAT CTA GGT ACC ATC GTT CAA GGT AAA ATT GAN
481 His Val Asp Arg His Ile Asn Ala Pro Phe Met Leu Pro Ile Ala Ala Lys Met Lys Asp Leu Gly Thr Ile Val Glu Gly Lys Ile Glu

1531 TCC GGT CAT ATC AAA AAG GGT CAA TCC ACC CTA CTG ATG CCT AAC AAA ACC GCT GTG GAA ATT CAA AAT ATT TAC AAC GAA ACT GAA AAN
511 Ser Gly His Ile Lys Lys Gly Gln Ser Thr Leu Leu Met Pro Asn Lys Thr Ala Val Glu Ile Gln Asn Ile Tyr Asn Glu Thr Glu Asn

1621 GAA GTT GAT ATG GCT ATG TGT GGT GAG CAA GTT AAA CTA AGA ATC AAA GGT GTT CAA GAA GAA GAC ATT TCA CCA GGT TTT GTA CTA ACN
541 Glu Val Asp Met Ala Met Cys Gly Glu Gln Val Lys Leu Arg Ile Lys Gly Val Glu Glu Glu Asp Ile Ser Pro Gly Phe Val Leu Thr

1711 TCG CCA AAG AAC CCT ATC AAG ACT GTT ACC AAG TTT GTA GCT CAA ATT GCT ATT GTA GAA TTA AAA TCT ATC ATA GCA GCC GGT TTT TCN
571 Ser Pro Lys Asn Pro Ile Lys Ser Val Thr Lys Phe Val Ala Gln Ile Ala Ile Val Glu Leu Lys Ser Ile Ile Ala Ala Gly Phe Ser

Kpn I
1801 TGT GTT ATC CAT GTT CAT ACA GCA ATT GAA GAG GTA CAT ATT GTT AAG TTA TTG CAC AAA TTA GAA AAG GGT ACC AAC CGT AAG TCA AAN
601 Cys Val Met His Val His Thr Ala Ile Glu Glu Val His Ile Val Lys Leu Leu His Lys Leu Glu Lys Gly Thr Asn Arg Lys Ser Lys

1891 AAA CCA CCT GCT TTT GCT AAG AAG GGT ATG AAG GTC ATC GCT GTT TTA GAA ACT GAA GCT CCA GTT TGT GTG GAA ACT TAC CAA GAT TAN
631 Lys Pro Pro Ala Phe Ala Lys Lys Gly Met Lys Val Ile Ala Val Leu Glu Thr Glu Ala Pro Val Cys Val Glu Thr Tyr Gln Asp Tyr

Kpn I
1981 CCT CAA TTA GGT AGA TTC ACT TTG AGA GAT CAA GGT ACC ACA ATA GCA ATT GCT AAA ATT GTT AAA ATT GCC GAG TAA ATTTCTTGCAAACA
661 Pro Gln Leu Gly Arg Phe Thr Leu Arg Asp Gln Gly Thr Thr Ile Ala Ile Gly Lys Ile Val Lys Ile Ala Glu





2073 AAGTAAATGCAAACACAAfAATACCGATCATAAAGCATTTTCTTCTATATTA,^AAAACAAGGTTTAATAAAGCTGTTATATATATATATATATATATAGACGTATAAT TAGTT rAGTT^ 

Xba I 

2193 mTGTACCATATACCAT.UACAAGGTAAACTTCACCTCTCAATATATCTAGAATTTCATA/VAAATATCTAGCAAGGTTTCAACTCCTTCAATCA^ 

2313 COTTATTTCAGAATGTGCAjUATCTATTAGTGACATGGAiCTCAAAGAACCACTTGTTTTTTTGTCCTTTGGTCCrrCGCTGCTTCCCTCGGCATCATCATCATCATCATCATCATTAT" 



2433 ATCATCGTCGTCATCATCGTCTATAAAATCATCTCGC,^T^AGTTTGT(:AACATC^TTTA(;TA&VTCCCATCGCTCCGGGTCTCCTTCGTAAATAAACAAAAGACTACTTGATATCATTC' 
2553 AACTTCTTCTTCTAGCATAGTArTATAAAA 

Fig. 2. Nucleotide sequence and deduced amino acid sequence of the SUP2 gene. The location of restriction sites is indicated. Sequence
elements TATATT, AATAAA and TAG...TAGT...TTT (Zaret and Sherman, 1982), which may be relevant for initiation or termination
of the transcription, are underlined by solid lines. Major and minor transcription start points are marked by downward arrows.
Underlined by dashed lines are: HOMOL1-like sequence, the second and third in-frame ATG codons and sequences GTATGT and
TACTAAC typical for yeast introns. Second ATG and the GTATGT sequence do overlap.



two fragments, both having an unusual amino acid
composition (Table II; Fig. 5).

Region A is a region of 123 aa that begins at the first methionine and contains repeats of three
sequence elements, which make up most of its length (Fig. 6). Sequence
Gln-Gly-Gly-Tyr-Gln-(Gln)-Gln-Tyr-Asn-Pro is repeated about six times (Fig. 6b). This region is
rich in Gln (28%), Gly (17%), Asn (16%) and Tyr (16%), all four amino acids making up 78%.
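
The percentages quoted for region A follow directly from the residue counts in Table II. The short sketch below redoes that arithmetic; the Tyr count of 20 is taken as the difference between the Tyr total (35) and the count in the C-terminal region (15), and the helper name `percent` is ours.

```python
REGION_A_LENGTH = 123  # residues 1-123
# Residue counts for region A as listed in Table II.
REGION_A_COUNTS = {"Gln": 35, "Gly": 21, "Asn": 20, "Tyr": 20}


def percent(count, length):
    """Share of a residue type in a region, in percent."""
    return 100.0 * count / length


shares = {aa: percent(n, REGION_A_LENGTH) for aa, n in REGION_A_COUNTS.items()}
combined = percent(sum(REGION_A_COUNTS.values()), REGION_A_LENGTH)

for aa, share in shares.items():
    print(f"{aa}: {share:.1f}%")
print(f"Gln + Gly + Asn + Tyr: {combined:.1f}%")
```

Within rounding, the four shares and their sum agree with the values given in the text.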

Region B is a region of aa 124-253 rich in charged amino acids, Lys (18%) and Glu (18%), which may
be further subdivided into four stretches: (1) a stretch of aa 124-164 is positively charged and
resembles the signal sequences for mitochondrial import (von Heijne, 1986; see RESULTS, section g,
for details); (2) a stretch of aa 165-222 contains several repeats of tetrapeptides:
Lys-Lys-Glu-Glu, Thr-Lys-Glu-Pro, Glu-Glu-Lys-Ser, Thr-Glu-Glu-Lys; (3) a stretch of aa 223-235
does not contain charged residues; (4) a stretch of aa 236-253 carries a negative charge (9 aa
residues out of 18 are Asp or Glu).

It is interesting that region B contains 24 Lys, but
does not contain Arg.
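
The charge distribution over the four stretches of region B can be tallied from the deduced sequence. The sketch below is only an illustration: the one-letter string is our transcription of residues 124-253 from Fig. 2, and His is not counted as a charged residue.

```python
# Residues 124-253 of the deduced sequence, transcribed from Fig. 2 (one-letter code).
REGION_B = (
    "MSLNDFQKQQKQAAPKPKKTLKLVSSS"      # aa 124-150
    "GIKLANATKKVGTKPAESDKKEEEKSAETK"   # aa 151-180
    "EPTKEPTKVEEPVKKEEKPVQTEEKTEEKS"   # aa 181-210
    "ELPKVEDLKISESTHNTNNANVTSADALIK"   # aa 211-240
    "EQEEEVDDEVVND"                    # aa 241-253
)
REGION_B_START = 124
STRETCHES = {1: (124, 164), 2: (165, 222), 3: (223, 235), 4: (236, 253)}


def charge_profile(first, last):
    """Return (length, basic residues, acidic residues) for aa first..last."""
    segment = REGION_B[first - REGION_B_START:last - REGION_B_START + 1]
    basic = sum(segment.count(residue) for residue in "KR")
    acidic = sum(segment.count(residue) for residue in "DE")
    return len(segment), basic, acidic


profiles = {n: charge_profile(*bounds) for n, bounds in STRETCHES.items()}
for number, (length, basic, acidic) in profiles.items():
    print(f"stretch {number}: {length} aa, {basic} Lys/Arg, {acidic} Asp/Glu")
```

With this transcription the first stretch carries ten Lys against one Asp, the third has no Lys, Arg, Asp or Glu, and the fourth has nine acidic residues out of 18, in line with the description above.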

(f) Possible existence of additional SUP2 gene products

RNA analysis reveals a single transcript for the SUP2 gene containing a complete ORF. However,
detailed analysis of the nucleotide and deduced amino acid sequences points to the possibility of
existence of shorter transcripts and corresponding protein products. One may suggest a possibility
of translation initiation on the second and third ATG codons as well as excision of a part of the
coding sequence resulting from splicing, as shown in sections g-i, below.




Fig. 3. Mapping of the 5' end of the SUP2 mRNA. Total yeast RNA was hybridized to a single-stranded
32P-labeled KpnI-BcnI fragment (nt 164 to -205) at 46°C in 80% formamide and treated with nuclease
in the following concentrations: lanes: 2, 2000 u/ml of S1; 3, 4000 u/ml of S1; 4, 1000 u/ml of
MBN; 5, 4000 u/ml of MBN; 6, control without yeast RNA, treated with S1. A sequencing lane of a
known sequence was used as a marker (lanes 1 and 7). The transcription start points are indicated
by arrowheads.



(g) Initiation of the translation on the second in-frame ATG

As we have shown earlier, many sup2 mutations cause a respiratory deficiency, reduction in
mitochondrially synthesized cytochrome content and decrease in the rate of protein synthesis in
mitochondria. These data allowed us to predict the existence

-37 ↓
            TCGACTTGCTCGGAA
Consensus:  T  AYYTGCYCR  A
            TATATCTGCCCACTA
-15 ↑

Fig. 4. Similarity of the two major transcription start point regions. Transcription start points
are marked by arrows. R designates purine (A or G), Y designates pyrimidine (T or C). The upstream
sequence is on top, the downstream sequence is at the bottom (see Fig. 2), and the consensus
sequence is in the middle line.
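
The consensus line of Fig. 4 can be reproduced mechanically from the two start-point regions. The sketch below is illustrative; the function name and the use of a dot for positions without a purine or pyrimidine match are our choices, not the authors' notation.

```python
UPSTREAM = "TCGACTTGCTCGGAA"     # region around the start point at -37
DOWNSTREAM = "TATATCTGCCCACTA"   # region around the start point at -15


def pairwise_consensus(seq_a, seq_b):
    """Identical bases are kept; purine pairs give R, pyrimidine pairs give Y."""
    symbols = []
    for base_a, base_b in zip(seq_a, seq_b):
        pair = {base_a, base_b}
        if len(pair) == 1:
            symbols.append(base_a)
        elif pair == {"A", "G"}:
            symbols.append("R")
        elif pair == {"C", "T"}:
            symbols.append("Y")
        else:
            symbols.append(".")  # no match at this position
    return "".join(symbols)


consensus = pairwise_consensus(UPSTREAM, DOWNSTREAM)
print(consensus)
```

Eleven of the fifteen positions match at the level of identical bases or base class, which is the similarity the figure points out.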



TABLE II

Amino acid composition of the SUP2 gene product *

aa       A (1-123)   B (124-253)   E (254-685)   Total (1-685)

Ala          6            9            28             43
Arg          2            0            16             18
Asn         20            7            18             45
Asp          2            7            21             30
Cys          0            0             5              5
Gln         35            6            12             53
Glu          0           23            34             57
Gly         21            2            37             60
His          0            1            12             13
Ile          0            3            29             32
Leu          1            7            27             35
Lys          1           24            41             66
Met          1            1            17             19
Phe          3            1            12             16
Pro          6            8            16             30
Ser          5           10            20             35
Thr          0           11            28             39
Trp          0            0             4              4
Tyr         20            0            15             35
Val          0           10            40             50

Total      123          130           432            685

* Amino acid composition of regions A (aa 1-123), B (aa 124-253), E (aa 254-685) and the entire
SUP2 protein is shown. Unusually high content of some amino acids is underlined.
Fig. 5. Schematic representation of the predicted primary structure of the SUP2 protein. Segment (A) represents a region containing
extensive repeats of 8-10-bp sequence elements (shown by short horizontal arrows), and rich in Gln (28%), Gly (17%), Asn (16%)
and Tyr (16%). Segment (B) represents a region with high content of charged residues Lys (18%) and Glu (18%). It may be subdivided
into four stretches: (1) a positively charged stretch similar to signal sequences for mitochondrial import (von Heijne, 1986); (2) a stretch
containing several tetrapeptide repeats rich in Lys and Glu; (3) a stretch without charged residues; (4) a stretch carrying a negative
charge. Segment (E) represents a region homologous to the full length of yeast EF-1α. The sites corresponding to the first, second and
third ATG codons are denoted by downward arrows. Putative splice sites are marked by triangles. The scale below the map is in aa.
(See also Fig. 2 and Table II.)



of a SUP2 gene product, which may be imported into mitochondria (Surguchov et al., 1984). Most of
such imported proteins have a signal sequence at their N termini, which is positively charged and
able to form an amphiphilic helix (von Heijne, 1986). A similar sequence element is present in a
single site in the SUP2 protein after the second methionine (aa 124-164).

Conserved 12-bp sequences, HOMOL1 and RPG, are present in the 5'-flanking regions of most yeast
ribosomal protein genes (Teem et al., 1984; Leer et al., 1985). The sequence HOMOL1 is also found
in 5'-flanking regions of genes coding EF-1α (Huet et al., 1985) and in the SUP1 gene (Breining
and Piepersberg, 1986). It has been proposed that these sequences are required for the
transcriptional regulation of the components of the translational apparatus (Huet et al., 1985).
The HOMOL1 box is usually located before the TATA box at a distance of 150-400 bp upstream from the
transcription start point. In the SUP2 gene a sequence AACATCTATATC similar to the HOMOL1
sequence, AACATC(T/C)(G/A)T(A/G)CA, is present. However, since this sequence is situated after the
presumed TATA box, at nt positions -27 to -16 before the first ATG codon, it is possible that it
regulates initiation of transcription at a site before the second ATG. A corresponding putative
TATA box is located 150 bp upstream from the second ATG.

(h) Initiation of translation on the third in-frame ATG

Upon alignment of homologous regions of amino acid sequences of EF-1α and the SUP2 protein, the
initiating methionine of EF-1α corresponds exactly to the third methionine in the SUP2 protein,
thus indicating possible involvement of the latter in the initiation of translation. The following
observation confirms this suggestion. Upon deletion from the SUP2 gene of a restriction fragment
HindIII-HindIII (nt 99-434), the second in-frame ATG is removed and the reading frame beginning
from the first ATG is disrupted. However, high copy number plasmids containing such deletions
still complement certain temperature-sensitive sup2 mutations. This result can be explained only
by the existence of a protein initiated on the third in-frame ATG. The calculated Mr of this
protein is 48039.

A minor mRNA band of 1.4 kb, hybridizing to the coding sequence of the SUP2 gene, which has been
observed by Surguchov et al. (1986), may correspond to a transcript initiated before the third ATG
codon. However, such a band was not found in this study. A possible explanation for this
discrepancy is that this transcript occurs in relatively small amounts depending on the conditions
used for growth.

(i) Possibility of alternative splicing

The continuous ORF of the SUP2 gene contains sequences that are typical for introns in
S. cerevisiae genes, including a completely conserved sequence TACTAAC, which is present in all
yeast introns. This sequence is found in the SUP2 gene around bp 1700. At the 5' end of yeast
introns a sequence GTATGT or, less frequently, GTACGT is located. Both of these sequences are
found in the ORF of the SUP2 gene near bp 364 and 1046, respectively. A triplet TAG (bp 1748)
nearest to the sequence TACTAAC may be regarded as a 3' end of this hypothetical intron. The first
of two donor splice sites (GTATGT) seems to be a more likely candidate for the 5' end of the
intron, because in this case the reading frame is not shifted by splicing. Location of this site
is not random. The GTATGT sequence covers the second ATG codon and the border of the A and B
regions of the deduced polypeptide (Fig. 5). The size of the protein product corresponding to
spliced mRNA would be 25 kDa.

Fig. 6. Repeats in the N-terminal region of the SUP2 protein.
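
The reading-frame argument can be checked with simple coordinate arithmetic. The sketch below is a rough illustration under stated assumptions: the exact positions of the two candidate donor sites (first nt at 368 and 1050) and of the acceptor TAG (nt 1748-1750) are our reading of Fig. 2, since the text only gives approximate coordinates, and the product mass is estimated from the mean residue mass of the full-length protein.

```python
ORF_LENGTH_NT = 2055                  # coding nucleotides, from the text
MEAN_RESIDUE_MASS_DA = 76545 / 685    # full-length mass / number of residues

# Assumed coordinates, read from Fig. 2 (not stated exactly in the text).
ACCEPTOR_LAST_NT = 1750               # last nt of the TAG triplet at bp 1748
DONOR_FIRST_NT = {"GTATGT": 368, "GTACGT": 1050}


def spliced_coding_length(donor_first_nt):
    """Coding length left after removing donor_first_nt..ACCEPTOR_LAST_NT."""
    exon1 = donor_first_nt - 1
    exon2 = ORF_LENGTH_NT - ACCEPTOR_LAST_NT
    return exon1 + exon2


results = {}
for motif, first_nt in DONOR_FIRST_NT.items():
    length = spliced_coding_length(first_nt)
    in_frame = length % 3 == 0
    # A mass estimate only makes sense when the frame is preserved.
    mass_kda = (length // 3) * MEAN_RESIDUE_MASS_DA / 1000 if in_frame else None
    results[motif] = (length, in_frame, mass_kda)
    print(motif, length, in_frame, mass_kda)
```

With these coordinates only the GTATGT donor keeps the downstream reading frame, and the corresponding product of 224 residues comes out at roughly 25 kDa, consistent with the figure given above.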

A large and functionally important part of the sequence lies inside the proposed intron, including
an assumed signal for mitochondrial import and domains for which participation in GTP- and
aminoacyl-tRNA-binding is predicted due to homology with EF-1α (Kushnirov et al., 1987). This
indicates that the unspliced transcript must be expressed, even if splicing does occur. Such
alternative splicing has not been described in the yeast S. cerevisiae.

Although we did not detect multiple transcripts of the SUP2 gene, their existence as well as
expression of the corresponding protein products cannot be excluded.

(j) Possible functions of the SUP2 protein

The C-terminal part of the SUP2 protein, beginning from the third methionine (Met-254), shows
significant homology to yeast EF-1α as well as to a family of analogous factors from other species
(Kushnirov et al., 1987). The degree of amino acid homology amounts to 62%, considering
conservative amino acid substitutions as homologous. Furthermore, nonhomologous stretches of
significant length are absent in this region. This allows us to suggest that the SUP2 protein
possesses, apart from its N-terminal domains, the same functional domains as EF-1α, including
GTP- and aminoacyl-tRNA-binding domains, where the degree of homology is highest. One might
speculate then that these two proteins act at the same site on the ribosome and that their mode of
action is rather similar. However, it is important to emphasize that they are not interchangeable,
since disruption of the SUP2 gene is lethal (M.D.T.-A. and A.R. Dagkesamanskaya, in preparation).
Furthermore, as pointed out in RESULTS, section d, the SUP2 protein appears to be much less
abundant than EF-1α. Taken together, these and other data are consistent with the assumption that
the product of the SUP2 gene is a soluble factor that participates in the control of the fidelity
of translation.

In the well-studied translation system of E. coli, a minor protein similar to the SUP2 protein has
not been found yet. At the same time, omnipotent suppressor mutations in EF-Tu have been described
(Vijgenboom et al., 1985). Moreover, analysis of these mutants revealed a reduction in the accuracy
of protein synthesis at both the primary aminoacyl-tRNA selection and the proofreading steps
(Tapio and Kurland, 1986). This allows us to suggest that EF-Tu, apart from a function analogous to
EF-1α, may also perform a proofreading function, for which the SUP2 protein is specialized in
S. cerevisiae.

To determine the role of SUP2 gene product(s) in protein synthesis it will be necessary to identify
and purify the protein and study it biochemically. Genetic approaches and recombinant DNA
techniques may also give valuable information, for example, examination of nucleotide
substitutions leading to suppression.